We consider a bilinear signal process driven by a Gauss-Markov process which is observed in additive, white, gaussian noise. An exact stochastic differential equation for the least squares filter is derived when the Lie algebra associated with the signal process is nilpotent. It is shown that the filter is also bilinear and, moreover, that it satisfies an analogous nilpotency condition. Finally, some special cases and an example are discussed, indicating ways of reducing the filter dimensionality.
OPTIMAL FILTERS FOR BILINEAR SYSTEMS
WITH NILPOTENT LIE ALGEBRAS

1.  INTRODUCTION 

In recent years, detection, estimation and control of signal processes represented by bilinear systems have received some attention in the literature; see, e.g., articles [1], [2] and [3] and survey [4]. The principal motivation for studying this class of problems lies in its potential applications to a variety of practical areas such as inertial navigation, satellite attitude control and angle modulation.

We focus on least squares estimators in an additive, white, gaussian noise environment. In [1], such estimators have been obtained in recursive and closed form under the assumption that the Lie algebras associated with the signal process are abelian. In [3], the existence of such finite dimensional, recursive estimators has been established under the weaker requirement that these Lie algebras need only be nilpotent; no attempt, however, is made towards displaying the estimator equations themselves.

In the present paper, we derive explicitly the finite dimensional, closed form, recursive filtering equations when the signal process satisfies a nilpotency condition, thus supplying a complete and constructive solution to this class of problems. In this process, we prove that the filter is bilinear as well and, moreover, that it also possesses an analogous nilpotent property. A number of interesting special cases are identified in which the estimator can be alternatively realized via a linear filter followed by a nonlinear postprocessor, a structure that may prove advantageous from the viewpoint of practical implementation.


In  the  next  section,  we  formulate  the  problem  and  present  some 
mathematical  preliminaries.  The  third  section  contains  the  main  theorem 
on  the  properties  of  the  optimal  filter,  and  the  final  section  deals  with 
computational  considerations. 

2.  PROBLEM  STATEMENT  AND  PRELIMINARIES 


Consider the following standard linear Itô models for the signal and observation processes respectively.


The Signal Model:

$$d\xi(t) = F(t)\,\xi(t)\,dt + Q^{1/2}(t)\,dw(t), \quad t \ge 0 \tag{2.1}$$

The Observation Model:

$$dz(t) = H(t)\,\xi(t)\,dt + R^{1/2}(t)\,dv(t), \quad t \ge 0 \tag{2.2}$$

where $w(\cdot)$ and $v(\cdot)$ are standard $N$ and $P$ dimensional independent Wiener processes respectively, $\xi(t) \in \mathbb{R}^N$, $z(t) \in \mathbb{R}^P$, $\xi(0)$ is a zero mean, gaussian random vector independent of the $w(\cdot)$ and $v(\cdot)$ processes, and $F(\cdot)$, $Q^{1/2}(\cdot)$, $H(\cdot)$, $R^{1/2}(\cdot)$ are time-dependent matrices of appropriate dimensions, with $Q(t)$, $R(t)$ positive definite and continuously differentiable for all $t$.
Our interest in this paper centers on the least squares estimation of nonanticipative, square integrable, nonrandom functionals on the Gauss-Markov process $\xi(\cdot)$ defined in (2.1), based on the observation process $z(\cdot)$ defined in (2.2). It is only natural to represent a functional of this type by a dynamical system driven by $\xi(\cdot)$.

Clearly, this problem fits the framework of the nonlinear filtering problem discussed by Kushner [5]. His solution, however, requires, in general, solving an infinite set of coupled stochastic differential equations. As a result, a number of approximation schemes have since been proposed to "close" this infinite set, notable among them: the Extended Kalman Filter, the Symmetric Density Filter, the Bass-Schwartz Filter, a cumulant discard hypothesis, a fourth moment assumption and various other moment approximations; the interested reader is referred to Chapter 9 of [6] and references therein. Each of these attempts has experienced varying degrees of success, depending on the specific practical application at hand. Besides, they have often lacked rigorous mathematical justification.

Our approach here is different, in that we seek to "close" the Kushner equations exactly by suitably restricting the class of nonlinear functionals to be estimated, thus obtaining an exact solution to the restricted problem. Moreover, we should like this restricted subset of functionals to be also "dense" in the class of square integrable functionals, so that we would have essentially "solved" the global nonlinear problem as well. An analogous approach has in the past been taken by Balakrishnan [7] and more recently by Huang [8], Chapter V, although in a "static" framework, that is, nonrecursive estimation of a single random variable from fixed length data without the dynamical signal and noise framework.

In view of the above considerations and some results available (see [9] and [10]) on approximation of nonlinear systems with deterministic inputs, the class of bilinear systems seems to be the most promising subset on which one would like to focus attention. We, therefore, assume that the signal process $\{x(t)\}_{t \ge 0}$, $x(t) \in \mathbb{R}^M$, to be estimated, evolves according to the following bilinear dynamical equation:

$$dx(t) = A\,x(t)\,dt + \sum_{i=1}^{N} B_i\,\xi_i(t)\,x(t)\,dt \tag{2.3}$$

where $A, B_1, \dots, B_N$ are $M \times M$ constant matrices, and $x(0)$ is independent of $\xi(0)$ and the $w(\cdot)$ and $v(\cdot)$ processes. With this model, we seek a finite dimensional stochastic differential equation for computing

$$\hat{x}(t/t) \triangleq E[x(t)/z^t].$$
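As a purely illustrative aid, the signal model (2.1) and the bilinear equation (2.3) can be simulated with a simple Euler-Maruyama scheme. The sketch below is not part of the paper: it assumes constant $F$ and $Q^{1/2}$, and the function name `simulate_bilinear`, the step size and the example matrices are hypothetical choices made only for this illustration.

```python
import numpy as np

def simulate_bilinear(F, Q_sqrt, A, Bs, x0, xi0, dt, steps, rng):
    """Euler-Maruyama sketch of (2.1) and (2.3), assuming constant F and Q^{1/2}."""
    N, M = len(xi0), len(x0)
    xi = np.empty((steps + 1, N))
    x = np.empty((steps + 1, M))
    xi[0], x[0] = xi0, x0
    for k in range(steps):
        dw = rng.normal(scale=np.sqrt(dt), size=N)       # Wiener increment
        drift = A + sum(B * xi[k, i] for i, B in enumerate(Bs))
        x[k + 1] = x[k] + (drift @ x[k]) * dt            # bilinear state (2.3)
        xi[k + 1] = xi[k] + (F @ xi[k]) * dt + Q_sqrt @ dw  # Gauss-Markov input (2.1)
    return xi, x

# Hypothetical example: scalar input, 2-dimensional state, A = 0 and a
# strictly upper triangular (hence nilpotent) B_1.
rng = np.random.default_rng(0)
F = np.array([[-1.0]])
Q_sqrt = np.array([[1.0]])
A = np.zeros((2, 2))
B1 = np.array([[0.0, 1.0], [0.0, 0.0]])
dt, steps = 1e-3, 1000
xi, x = simulate_bilinear(F, Q_sqrt, A, [B1], np.array([1.0, 2.0]),
                          np.array([0.0]), dt, steps, rng)
```

With this particular choice the second state component stays constant and the first is a discretized integral of the input, which gives a quick sanity check of the scheme.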

Besides  the  aforementioned  mathematical  considerations,  the  above 
signal  model  has  strong  justification  on  physical  grounds  as  well.  As 
discussed  e.g.  in  [1]  and  [11],  (the  state  transition  matrices  of)  bilinear 
systems,  evolving  on  Lie  groups  can  perfectly  represent  certain  types  of 
motion  such  as  rotation  of  rigid  bodies. 


It may be appropriate at this point to recall some pertinent definitions and facts from the theory of Lie algebras and Lie groups. Further details may be found, for example, in [12]. Let $L$ denote a Lie algebra and $G \triangleq e^{L}$ the associated Lie group.


It is easy to verify that the set

$$L' \triangleq [L, L] \triangleq \{[A,B] \mid A, B \in L\} \tag{2.4}$$

is an (ideal) subalgebra, where $[A,B] \triangleq AB - BA$ is the Lie bracket operation. Now define analogously the following two series of decreasing, nested subalgebras, recursively as

$$L^k \triangleq [L, L^{k-1}] \triangleq \{[A,B] \mid A \in L,\ B \in L^{k-1}\}, \quad k = 2, 3, \dots, \ \text{with } L^1 = L \tag{2.5}$$

and

$$L_k \triangleq [L_{k-1}, L_{k-1}] \triangleq \{[A,B] \mid A, B \in L_{k-1}\}, \quad k = 2, 3, \dots \tag{2.6}$$

Definition 2.1: The Lie algebra $L$ (and the Lie group $G$) is said to be

a) abelian if $L' = \{0\}$

b) nilpotent if there is an integer $K$ such that $L^K = \{0\}$

c) solvable if there is an integer $K$ such that $L_K = \{0\}$.

Analogous definitions can be made with respect to an associative algebra as well, merely by replacing the Lie bracket operation by ordinary matrix multiplication in the above discussion. It then follows directly from the definitions that a) $\Rightarrow$ b) $\Rightarrow$ c) and that for a Lie algebra to possess one or more of these properties it is sufficient that the smallest matrix algebra containing the Lie algebra also possess the corresponding property.
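The nilpotency condition of Definition 2.1 can be checked numerically for a matrix Lie algebra given by a finite set of generators. The following is a minimal sketch, assuming the lower central series convention $L^1 = L$, $L^k = [L, L^{k-1}]$; the helper names (`bracket`, `span_basis`, `generated_lie_algebra`, `nilpotency_order`) and the rank tolerance are hypothetical and not taken from the paper.

```python
import numpy as np
from itertools import product

def bracket(X, Y):
    """Lie bracket [X, Y] = XY - YX."""
    return X @ Y - Y @ X

def span_basis(mats, tol=1e-10):
    """Orthonormal basis (as matrices) of the linear span of `mats`."""
    if not mats:
        return []
    shape = mats[0].shape
    V = np.array([m.ravel() for m in mats])
    _, s, vt = np.linalg.svd(V, full_matrices=False)
    rank = int(np.sum(s > tol))
    return [vt[i].reshape(shape) for i in range(rank)]

def generated_lie_algebra(gens, max_iter=50):
    """Basis of the Lie algebra generated by `gens` (closure under brackets)."""
    basis = span_basis(gens)
    for _ in range(max_iter):
        candidates = basis + [bracket(X, Y) for X, Y in product(basis, repeat=2)]
        bigger = span_basis(candidates)
        if len(bigger) == len(basis):
            return basis
        basis = bigger
    raise RuntimeError("closure did not stabilize")

def nilpotency_order(gens, max_order=50):
    """Smallest K with L^K = {0}; None if no such K <= max_order is found."""
    L = generated_lie_algebra(gens)
    Lk = L
    for K in range(1, max_order + 1):
        if not Lk:
            return K
        Lk = span_basis([bracket(X, Y) for X in L for Y in Lk])
    return None

# Example generators: two strictly upper triangular 3 x 3 matrices.
E12 = np.zeros((3, 3)); E12[0, 1] = 1.0
E23 = np.zeros((3, 3)); E23[1, 2] = 1.0
heisenberg_order = nilpotency_order([E12, E23])
```

Under this convention an abelian algebra returns 2, and a non-nilpotent algebra such as the one generated by the two off-diagonal $2 \times 2$ unit matrices returns `None`.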
Coming back to our estimation problem summarized in the signal and noise models (2.1), (2.2) and (2.3), it was shown in [1] that if the Lie algebra generated by the matrices $A, B_1, \dots, B_N$, denoted by $\{A, B_1, \dots, B_N\}_L$, is abelian, the estimator for $x(\cdot)$ consists of a linear filter and a nonlinear postprocessor. In [3], the abelian type condition was replaced by a weaker nilpotency condition. However, [3] establishes merely the existence of a finite dimensional recursive filter. In the next section, we derive explicitly, under the above Lie-algebraic nilpotency condition, the stochastic differential equations of the filter and furthermore explore its algebraic properties as well. A preliminary analysis of this type for the much simpler special case of associative algebraic nilpotency may be found in [13].

3.  PROPERTIES  OF  THE  NON-LINEAR  FILTER 

We  begin  with  some  lemmas  which  will  be  heavily  used  in  the  proof 
of  the  theorem. 

Lemma  1 (Canonical  Nilpotent  Form): 

Every  nilpotent  matrix  Lie  algebra  can  be  converted,  by  a similarity 
transformation,  into  its  canonical  Lie  algebra  consisting  of  block  diagonal 
matrices  wherein  each  diagonal  block  is  triangular  with  equal  elements  on 
the  diagonal. 


Proof : See  Sagle  and  Walde  [12]  pp.  224-227. 


Lemma 2 (Exponential Formula):

Let $\mathcal{A}$ be a Lie algebra of matrices with $\{H_i\}_{i=1}^{n}$ as its basis. Then for any pair $A, B \in \mathcal{A}$ we have

$$e^{At}\,B\,e^{-At} = \sum_{i=1}^{n} g_i(t)\,H_i \tag{3.1}$$

where $\{g_i(\cdot)\}_{i=1}^{n}$ are analytic functions.

Proof: See lemmas (i) and (ii) of Wei and Norman [14].
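A small numerical check may help make the exponential formula concrete. In the nilpotent case the standard series $e^{At} B e^{-At} = \sum_k \frac{t^k}{k!}\,\mathrm{Ad}_A^k(B)$ terminates after finitely many terms, so the coefficients $g_i(t)$ are polynomials. The sketch below uses hypothetical helper names (`expm_nilpotent`, `ad_series`) and an example pair of strictly upper triangular matrices chosen only for illustration.

```python
import numpy as np
from math import factorial

def expm_nilpotent(X):
    """exp(X) for a nilpotent n x n matrix X via the finite power series."""
    n = X.shape[0]
    return sum(np.linalg.matrix_power(X, k) / factorial(k) for k in range(n))

def ad_series(A, B, t, terms):
    """Partial sum of sum_k t^k/k! Ad_A^k(B), with Ad_A(B) = AB - BA."""
    out = np.zeros_like(B, dtype=float)
    term = B.astype(float)
    for k in range(terms):
        out += term * t**k / factorial(k)
        term = A @ term - term @ A
    return out

# Example: A = E12, B = E23 in the 3 x 3 strictly upper triangular algebra.
A = np.zeros((3, 3)); A[0, 1] = 1.0
B = np.zeros((3, 3)); B[1, 2] = 1.0
E13 = np.zeros((3, 3)); E13[0, 2] = 1.0
t = 0.7
lhs = expm_nilpotent(A * t) @ B @ expm_nilpotent(-A * t)
rhs = ad_series(A, B, t, terms=3)
```

Here both sides equal $E_{23} + t\,E_{13}$, i.e. the coefficients along the basis elements $E_{23}$ and $E_{13}$ are $1$ and $t$.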


Lemma 3:

Consider the signal and observation models of (2.1) and (2.2) respectively. Define a (vector) process $\{y(t)\}_{t \ge 0}$ by

$$dy(t) = D\,y(t)\,dt + \sum_{i=1}^{N} E_i\,\xi_i(t)\,y(t)\,dt \tag{3.2}$$

where $D$, $\{E_i\}_{i=1}^{N}$ are matrices of appropriate dimension, and $y(0)$ is independent of $\xi(0)$ and the $w(\cdot)$ and $v(\cdot)$ processes.

Then $\hat{y}(t/t) \triangleq E[y(t)/z^t]$ satisfies the following stochastic differential equation:

$$d\hat{y}(t/t) = D\,\hat{y}(t/t)\,dt + \sum_{i=1}^{N} E_i\,E^t[\xi_i(t)\,y(t)]\,dt + \big\{E^t[y(t)\,\xi^T(t)] - \hat{y}(t/t)\,\hat{\xi}^T(t/t)\big\}\,H^T(t)\,R^{-1}(t)\,d\nu(t), \qquad \hat{y}(0/0) = E[y(0)] \tag{3.3}$$

where $E^t[\cdot] \triangleq E[\,\cdot \mid z^t] \triangleq E[\,\cdot \mid \{z(\tau) \mid 0 \le \tau \le t\}]$ and

$$d\nu(t) = dz(t) - H(t)\,\hat{\xi}(t/t)\,dt.$$

Proof: Apply the Kushner nonlinear filtering equations [5] to the signal process $(y^T(\cdot), \xi^T(\cdot))^T$ with $z(\cdot)$ as the observation process.

Lemma 4:

Let $x = (x_0, x_1, \dots, x_n)^T$ be a gaussian random vector with mean vector $m = (m_0, m_1, \dots, m_n)^T$ and covariance matrix $P = [P_{ij}]_{i,j=0}^{n}$. We then have the following relation:

$$E\Big[e^{x_0} \prod_{i=1}^{n} x_i\Big] = \begin{cases} (m_1 + P_{01})\,E\big[e^{x_0}\big], & n = 1 \\[6pt] (m_1 + P_{01})\,E\Big[e^{x_0} \prod_{i=2}^{n} x_i\Big] + \sum_{j=2}^{n} P_{1j}\,E\Big[e^{x_0} \prod_{i=2,\ i \ne j}^{n} x_i\Big], & n > 1 \end{cases} \tag{3.4}$$

Proof: We shall indicate only the main steps of the proof, leaving the purely algebraic manipulations to the reader.

$$E\Big[e^{x_0}\prod_{i=1}^{n}x_i\Big] = |2\pi P|^{-1/2}\int_{\mathbb{R}^{n+1}} \exp\Big\{-\tfrac12 (x-m)^T P^{-1}(x-m)\Big\}\; e^{x_0}\prod_{i=1}^{n}x_i\;dx$$

(where $|\cdot| \triangleq \det(\cdot)$)

$$= |2\pi P|^{-1/2}\; e^{\,m_0 + \frac{1}{2Q_{00}}}\int_{\mathbb{R}^{n+1}} \prod_{i=1}^{n}x_i\; \exp\Big\{-\tfrac12\Big[\Big(x_0 - m_0 - \tfrac{1}{Q_{00}}\Big)^2 Q_{00} + \sum_{(i,j)\ne(0,0)} Q_{ij}(x_i-m_i)(x_j-m_j)\Big]\Big\}\,dx$$

(upon completing the square for the term $i = j = 0$ and using the notation $Q_{ij} \triangleq (i,j)$th element of $P^{-1} \triangleq Q$).

The remaining steps are as follows. Express the cross terms through the cofactors $C_{ij}$ of $P_{ij}$. Integrate with respect to $x_0$, using the notation $P_* \triangleq [P_{ij}]_{i,j=1}^{n}$, $x_* \triangleq (x_1, \dots, x_n)^T$, $m_* \triangleq (m_1, \dots, m_n)^T$ and the facts that $Q_{ij}/Q_{i'j'} = C_{ij}/C_{i'j'}$ for any choice of integers $i, j, i', j'$ and $C_{00} = |P_*|$. Finally, combine the exponents, using the cofactor expansion relating $C_{0i}$ to the cofactors $C^*_{ij}$ of $P_{ij}$ in $P_*$ and the notation $p_* \triangleq (P_{01}, P_{02}, \dots, P_{0n})^T$. This gives

$$E\Big[e^{x_0}\prod_{i=1}^{n}x_i\Big] = e^{\,m_0 + \frac12 P_{00}}\; |2\pi P_*|^{-1/2}\int_{\mathbb{R}^{n}} \prod_{i=1}^{n} x_i\; \exp\Big\{-\tfrac12 (x_* - m_* - p_*)^T P_*^{-1} (x_* - m_* - p_*)\Big\}\,dx_*.$$

We have thus arrived at the following conclusion:

$$E\Big[e^{x_0}\prod_{i=1}^{n}x_i\Big] = e^{\,m_0 + \frac12 P_{00}} \cdot E\Big[\prod_{i=1}^{n} Y_i\Big] \tag{3.5}$$

where $Y = (Y_1, \dots, Y_n)$ is a gaussian random vector with mean vector $(m_1 + P_{01}, \dots, m_n + P_{0n})$ and $P_*$ as the covariance matrix.
Now, a standard moment theorem for gaussian vectors (see e.g. [15]) yields

$$E\Big[\prod_{i=1}^{n} Y_i\Big] = (m_1 + P_{01})\,E\Big[\prod_{i=2}^{n} Y_i\Big] + \sum_{j=2}^{n} P_{1j}\,E\Big[\prod_{i=2,\ i\ne j}^{n} Y_i\Big] \tag{3.6}$$

Combining (3.5) and (3.6), and repeatedly using an identity analogous to (3.5) for the vector $x$ with reduced dimensionality, gives (3.4), completing the proof.

Q.E.D.


We  are  now  in  a position  to  state  and  prove  the  main  theorem  of 
this  paper. 


Theorem 5: Consider the signal process $\{x(t)\}_{t \ge 0}$ evolving according to (2.3) and (2.1) and the observation process $\{z(t)\}_{t \ge 0}$ as in (2.2). Suppose that the Lie algebra $\mathcal{A}$ generated by the set of matrices

$$\{\mathrm{Ad}_A^k(B_i) \mid i = 1, 2, \dots, N;\ k = 0, 1, 2, \dots\}$$

is nilpotent, with dimension $N^*$ and order of nilpotency $N_*$, where

$$\mathrm{Ad}_A^0(B_i) \triangleq B_i$$

and, for $k = 1, 2, \dots$,

$$\mathrm{Ad}_A^k(B_i) \triangleq A \cdot \mathrm{Ad}_A^{k-1}(B_i) - \mathrm{Ad}_A^{k-1}(B_i) \cdot A.$$

Then the least squares filtered estimate

$$\hat{x}(t/t) \triangleq E\big[x(t)/\{z(\tau)\}_{0 \le \tau \le t}\big]$$

can be obtained from the finite dimensional, bilinear, stochastic differential equation of the following form:

$$\begin{aligned} d\hat{x}^*(t/t) &= \Big[A^*(t)\,dt + \sum_{i=1}^{N} B_i^*(t)\,\hat{\xi}_i(t/t)\,dt + \sum_{i=1}^{N} C_i^*(t)\,d\rho_i(t)\Big]\,\hat{x}^*(t/t), \\ \hat{x}^*(\cdot/\cdot) &\in \mathbb{R}^{M^*}, \qquad M^* \le M\,\frac{(N^2 N^*)^M - 1}{N^2 N^* - 1}, \\ \hat{x}_i^*(0/0) &= \begin{cases} E[x_i(0)], & i \le M \\ 0, & i > M \end{cases} \\ \hat{x}(t/t) &= L(t)\,\hat{x}^*(t/t) \end{aligned} \tag{3.7}$$

where

$$d\rho(t) \triangleq H^T(t)\,R^{-1}(t)\,d\nu(t)$$

is the modified innovations process, $\hat{\xi}(\cdot/\cdot)$ is obtained from the standard Kalman-Bucy filter (see [16]), $L(\cdot)$ is an $M \times M^*$ and $A^*(\cdot)$, $\{B_i^*(\cdot)\}_{i=1}^{N}$, $\{C_i^*(\cdot)\}_{i=1}^{N}$ are $M^* \times M^*$ matrix valued deterministic time functions such that the Lie algebra $\mathcal{A}^*$ generated by the set of matrices

$$\Big\{\mathrm{Ad}^{k}_{\mathrm{Ad}^{\ell}_{C_j^*(t)}(A^*(t))}\big(B_i^*(t)\big) \,\Big|\, i = 1, 2, \dots, N;\ j = 1, 2, \dots, N;\ k, \ell = 0, 1, \dots;\ t \ge 0\Big\}$$

is nilpotent with […] as the order of nilpotency.

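Proof: Let $\{H_i\}_{i=1}^{N^*}$ be a basis of $\mathcal{A}$ and $\bar{\mathcal{A}}$ the canonical form of $\mathcal{A}$. Hence, by Lemma 1, there exists a (nonsingular) matrix $S$ such that the set $\{H_i^*\}_{i=1}^{N^*}$ with $H_i^* \triangleq S H_i S^{-1}$, $i = 1, \dots, N^*$, is a basis of $\bar{\mathcal{A}}$ and

$$H_i^* = \mathrm{diag}\{{}^1H_i^*,\ {}^2H_i^*,\ \dots,\ {}^{\ell}H_i^*\}$$

for all $i = 1, \dots, N^*$, where each diagonal block ${}^kH_i^*$, $i = 1, \dots, N^*$, $k = 1, \dots, \ell$, is $M_k \times M_k$, $\sum_{k=1}^{\ell} M_k = M$, and of the following form:

$${}^kH_i^* = {}^kh_{11}^{i}\, I_{M_k} + {}^kV_i, \qquad {}^kh_{11}^{i} \in \mathbb{R}, \quad {}^kV_i \ \text{strictly upper triangular}. \tag{3.8}$$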


Now, consider the transformation

$$y(t) = S\,e^{-At}\,x(t) \tag{3.9}$$


Using the above observation and Lemma 2, the transformed version of (2.3) can be written as

$$dy(t) = \Big[\sum_{i=1}^{N^*} H_i^*\,\xi_i^*(t)\Big]\,y(t)\,dt \tag{3.10}$$

where $\xi^*(t) = D(t)\,\xi(t)$, and $D(\cdot)$ is a deterministic $N^* \times N$ matrix valued (analytic) time function. But note that the dynamical system (3.10) is in a "decoupled" form and hence is the direct sum of the $\ell$ "subsystems"

$$dy^k(t) = \Big[\sum_{i=1}^{N^*} {}^kH_i^*\,\xi_i^*(t)\Big]\,y^k(t)\,dt, \tag{3.11}$$

where $y^k(\cdot)$ is the vector

$$y^k \triangleq \big[\,y_{M_{k-1}+1},\ y_{M_{k-1}+2},\ \dots,\ y_{M_{k-1}+M_k}\,\big]^T$$

(with $M_0 \triangleq 0$). We have thus effectively segregated the filtering problem into $\ell$ independent subproblems. Hence (except for a possible increase in filter dimensionality) we may assume without loss of generality that $\ell = 1$, so that the original system has no nontrivial subsystems beside itself.
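Now, apply the filtering equation (3.3) of Lemma 3 to the system (3.10) with

$$H_k^* = \begin{bmatrix} h_{11}^k & h_{12}^k & \cdots & h_{1M}^k \\ & h_{11}^k & \ddots & \vdots \\ & & \ddots & h_{M-1,M}^k \\ 0 & & & h_{11}^k \end{bmatrix}, \quad k = 1, 2, \dots, N^* \tag{3.12}$$

First note that the solution to this system (for $\sigma_0 \ge 0$) can be explicitly written as

$$y_q(\sigma_0) = \begin{cases} e^{\varepsilon(\sigma_0)}\, y_q(0), & q = M \\[6pt] e^{\varepsilon(\sigma_0)} \Big[\, y_q(0) + \displaystyle\sum_{u=q+1}^{M}\ \sum_{s=1}^{u-q}\ \sum_{m_1, \dots, m_s = 1}^{N^*}\ \sum_{q = i_1 < j_1 = i_2 < \cdots < j_s = u} y_u(0) \int_0^{\sigma_0}\!\!\int_0^{\sigma_1}\!\!\cdots\!\int_0^{\sigma_{s-1}}\ \prod_{r=1}^{s} h^{m_r}_{i_r j_r}\, \xi^*_{m_r}(\sigma_r)\, d\sigma_r \Big], & q < M \end{cases} \tag{3.13}$$

where

$$\varepsilon(\sigma) \triangleq \sum_{k=1}^{N^*} h_{11}^k \int_0^{\sigma} \xi_k^*(\tau)\,d\tau. \tag{3.14}$$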


Now, using identity (3.4) of Lemma 4 (with conditional expectations), application of (3.3) to (3.10) yields:

$$\begin{aligned} d\hat{y}(t/t) ={}& \Big[\sum_{j=1}^{N} H_j^B(t)\,\hat{\xi}_j(t/t) + P_0(t)\Big]\hat{y}(t/t)\,dt + \sum_{k=1}^{N}\sum_{i=1}^{N^*}\sum_{j=1}^{N} h_{k,i,j}(t)\,\hat{y}^{(k,i,j)}(t/t)\,dt \\ &+ \Big[\sum_{i=1}^{N^*}\sum_{j=1}^{N} D_{ij}(t)\,\hat{y}^{(1,i,j)}(t/t) + P_0^1(t)\,\hat{y}(t/t),\ \ \sum_{i=1}^{N^*}\sum_{j=1}^{N} D_{ij}(t)\,\hat{y}^{(2,i,j)}(t/t) + P_0^2(t)\,\hat{y}(t/t),\ \ \dots, \\ &\qquad \sum_{i=1}^{N^*}\sum_{j=1}^{N} D_{ij}(t)\,\hat{y}^{(N,i,j)}(t/t) + P_0^N(t)\,\hat{y}(t/t)\Big]\,H^T(t)\,R^{-1}(t)\,d\nu(t), \\ \hat{y}(0/0) ={}& E[y(0)], \end{aligned} \tag{3.15}$$

where the deterministic matrix valued time functions $\{H_j^B(\cdot)\}_{j=1}^{N}$, $P_0(\cdot)$ and $\{h_{k,i,j}(\cdot)\}$ belong to the linear manifold $\mathcal{M}\{H_i^*\}_{i=1}^{N^*}$ generated by the set $\{H_i^*\}_{i=1}^{N^*}$ and can be computed from knowledge of the covariance matrix $P(\cdot)$ of the $\xi(\cdot)$ process and the matrix $D(\cdot) \triangleq [D_{ij}(\cdot)]$. Further, the "supplementary state vectors" $\{\hat{y}^{(k,i,j)} \mid j, k = 1, \dots, N;\ i = 1, \dots, N^*\}$ appearing in (3.15) are defined as follows:


,,  . .v  (k,i,j)  (k,i,j) 

y (k,i,j ; (t)  = yi(t)  # y2(t)  ,, 


(k,i,  j ) ]r 

/ *-  \ 


* yM(t) 


(k,i,j) 

y (0  ) 

(j  o 


0,  q = M 

M u-q  N* 

l l l l 

u=q+l  s=l  m j , . . . , mg=  1 q=ij  < i]  = l2”*<  l.s=" 


M;q  W f°o  f°l  , ™r 

l e yu(0)  •••]  Pii(ao’ar)hi  , j 

0 0 0 r r 


rO  rO , Ca, 


(3.1b) 


do  n 
j.= 


n h.  ,.  F (o„)do0  , 

t-  ll  h mi  l> 

it  r 

q < M 


X 


A 
and we have used the well-known fact that the conditional covariance matrix given by

$$P(\sigma, t) \triangleq E\big[(\xi(\sigma) - \hat{\xi}(\sigma/t))(\xi(\sigma) - \hat{\xi}(\sigma/t))^T / z^t\big] \tag{3.17}$$

is nonrandom for $\sigma \le t$.


A direct differentiation now reveals that the vector of augmenting states

$${}^1y(\cdot) \triangleq \big[\,y^{(1,1,1)T}(\cdot),\ y^{(1,1,2)T}(\cdot),\ \dots,\ y^{(N,N^*,N)T}(\cdot)\,\big]^T \in \mathbb{R}^{M N^2 N^*} \tag{3.18}$$

satisfies a differential equation of the following form:

$$d\,{}^1y(t) = \Big[\alpha^1(t) + \sum_{i=1}^{N} \gamma_i^1(t)\,\xi_i(t)\Big]\,{}^1y(t)\,dt + \beta^1(t)\,y(t)\,dt, \qquad {}^1y(0) = 0 \tag{3.19}$$


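where we have used the formula (see e.g. [17]) frequently used in fixed point smoothing, viz.,

$$\frac{\partial P(t,\sigma)}{\partial t} = \big[F(t) - P(t)\,H^T(t)\,R^{-1}(t)\,H(t)\big]\,P(t,\sigma) \tag{3.20}$$

and the $M \times M$ blocks of the matrices $\alpha^1(\cdot)$, $\beta^1(\cdot)$ and $\{\gamma_i^1(\cdot)\}_{i=1}^{N}$ belong respectively to the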


% 1^,  T^H*.  tJhJJ*  | , |tJhJ,tJh 


^{HiTr  h2ti*  ■ * ■ ♦ V1?}’ 


and

$$T_n^i \triangleq \begin{bmatrix} 0 & I_{n-i} \\ 0 & 0 \end{bmatrix}, \quad 1 \le i \le n \tag{3.21}$$


f )N 

Furthermore,  1 V*)[  are  block  diagonal  matrices  with  identical 

diagonal  blocks  given'bj  H®(-)t”,  i = 1, . . • , N.  For  future  reference, 
we  also  note  that  since  A 0,  ^(*)  C 77,  |ll . ( • )Tj|  • 

We may now again apply Lemmas 3 and 4 to (3.19) and obtain the following differential equation for ${}^1\hat{y}(t/t)$.


d y(t/t)  = 


,(k,i,j) 


| ot1  ( t ) + i Y®(t)q(t/t)  + po(t)]i;(t/t)dt 

1 i=l 

N n"  N ^(L,*,_ 

+ BL(t)J(t/t)dt  + l l I Yk  i ,(t)  y(t/t) 

k=i  i=i  j-l  K’ 

r N*  N ^(l,i,j) 

+ P (t)1y(t/t)  + l l D..(t)iy(t/t)  , P (t)  y(t/t) 

l o i=1  j=1  rj 


N*  N 


(2 , i , j ) 

+ l l D..(t)  y(t/t)  ,...Po(t)  y(t/t) 
i=lj=l 


N*  N (N,i,j), 

+ j l D.. (t)  y(t/t)  JH  (t)R  (t)dv(t) 


i-1  J-l 


1 


y (0/0)  = 0 


(3.22) 


The  "new"  set  of  supplementary  states  appearing  in  (3.22)  is  in  turn  given 
by 


L7 


rk  i n f (k,i, j )T  (k,i,j)T 
'yM  'J  A |y^l’l,1\t),  y'1-1’2'^) 


(k,i,  j )T-iT 

(N,N*,N) 


(k,i, j) 

y(k’ ,x' ,j ')(t)  A 


(k,i,j)  (k , i , j ) 

yl<k,'1^fe.y2(k,-r-J> 


II 

....  N;  i = 1... 

(k, i , j ) 

'(t), 

’’  VM  V 

(L) 


(k,i , i) 


q > M-l 


M_q  0°o) 


l 

' 2 

r,  ¥ r. 


rl,r2  ” 1 


y (o) 
u 


u-q 

N 

y 

L 

S=1 

L 

m ^ y • • • 

rO 

0 

V. 

. 

M-q-1 


0 0 0 


= i.  ,<•  • j =u 


VV  °rI)Pi'J'(ao-  0r2) 


m = k 
rl 

m = k' 
r2 


lr  Jr 
rl  1 


h. 

1 


mn 


, . da  ,do  n 

r _ Jr9  rl  r2  *=1 
2 2 Wrvr2 


h.  £ (o0)dO 


(3.23) 


Now,  the  "new"  augmenting  state  vector 


y(t)  A 


(1,1,D 


(1,1,2) 


(N.N  ,N) 


(t),  y 


(t)  , . 


(t) 


4 *2 

T MN  N 

t IR  (3.24) 


is easily seen to satisfy the following differential equation analogous to (3.19):

$$d\,{}^2y(t) = \Big[\alpha^2(t) + \sum_{i=1}^{N} \gamma_i^2(t)\,\xi_i(t)\Big]\,{}^2y(t)\,dt + \beta^2(t)\,{}^1y(t)\,dt, \qquad {}^2y(0) = 0 \tag{3.25}$$

where the $M \times M$ blocks of the matrices $\alpha^2(\cdot)$, $\beta^2(\cdot)$ and $\{\gamma_i^2(\cdot)\}_{i=1}^{N}$

now  belong  to  ft  jl^,  T^H* T2^**},%{t^H*,  T^H* T^H  *j  and 


„M 


^|hiT^,H2T^1 H ,*T2  [ ■ A1so  Yt(‘),  i = 1 N are  diagonal  with 

B M A [ E m! 

blocks  given  bv  H.(‘)T,  , and  y . . (•)  (*)T  V 

J - t.h]  v J -'.1  = 1 


It is now clear how the above process can be iterated. At exactly the $M$th application of the Kushner equations we find that $\gamma_i^{M-1} = 0$, $i = 1, 2, \dots, N$, so that no new supplementary states appear, resulting in closing the chain of coupled nonlinear filtering equations. Define


$$x^*(t) \triangleq \big[\,y^T(t),\ {}^1y^T(t),\ \dots,\ {}^{(M-1)}y^T(t)\,\big]^T.$$

We see that the dimension of $x^*(\cdot)$

(iii) $\{B_i^*\}_{i=1}^{N}$ are block diagonal (hence, in nilpotent canonical form) while $A^*(t)$ is block triangular with each $M \times M$ block being a multiple of $T_M^{\ell}$, $1 \le \ell \le M$.

T M 

Keeping in mind the above observations and carrying out the Lie bracket operations blockwise, we find that the matrices $A_{\ell,i}(t) \triangleq \mathrm{Ad}^{\ell}_{C_i^*(t)}(A^*(t))$, $i = 1, \dots, N$; $\ell = 0, 1, \dots$ inherit all the properties of $A^*(t)$ noted above, so that all the $M \times M$ blocks of the matrices $B_{k,j,\ell,i}(t) \triangleq \mathrm{Ad}^{k}_{A_{\ell,i}(t)}(B_j^*(t))$, $k = 1, 2, \dots$; $j = 1, \dots, N$ are strictly triangular. Since, as noted above, $\{B_i^*\}_{i=1}^{N}$ are themselves in nilpotent canonical form, the desired conclusion can be verified simply by carrying out blockwise the Lie bracket operations required in Definition 2.1 b) of nilpotency.

Q.E.D.


We thus see that the optimal filter structure (3.7) is similar to that of the signal model (2.3) in that (i) it is bilinear in both drift and diffusion terms, and furthermore (ii) it also possesses the nilpotency property of (2.3). This behavior is analogous to that of linear filtering problems in which a linear signal model gives rise to an optimal filter which is also linear in both the drift and diffusion terms.

The above structural features notwithstanding, it seems unlikely that the state spaces of (2.3) and (3.7) will be identical nilpotent group manifolds. This is obviously undesirable from a practical standpoint. One way to remedy this situation might be, rather than least squares, to look for error criteria themselves defined on such manifolds. Such an approach for signal processes evolving on abelian Lie groups was followed in [1].

4.  COMPUTATIONAL  CONSIDERATIONS 

Realization of the filter (3.7) in the form of a block schematic is shown below in figure 1. The practical significance of the bilinear property of the above filter is that on-line microprocessor implementation of the filter is still possible with easily available and cheap hardware consisting of integrators, summers and multipliers. This is especially important in view of the obviously huge dimensionality of this filter.
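The nonrandom error covariance that feeds the filter comes from a Riccati equation. As a hedged illustration, the sketch below integrates the standard Kalman-Bucy Riccati equation $\dot{P} = FP + PF^T + Q - PH^TR^{-1}HP$ with a plain Euler step. The paper only cites [16] for the Kalman-Bucy filter, so the constant matrices, the step size and the function names `riccati_rhs` and `integrate_riccati` are assumptions made for this example.

```python
import numpy as np

def riccati_rhs(P, F, Q, H, R):
    """Right-hand side of the Kalman-Bucy Riccati equation (constant matrices)."""
    return F @ P + P @ F.T + Q - P @ H.T @ np.linalg.solve(R, H @ P)

def integrate_riccati(P0, F, Q, H, R, dt, steps):
    """Forward Euler integration of the Riccati equation from P(0) = P0."""
    P = P0.copy()
    for _ in range(steps):
        P = P + dt * riccati_rhs(P, F, Q, H, R)
    return P

# Hypothetical scalar example: F = -1, Q = H = R = 1.
F = np.array([[-1.0]])
Q = np.array([[1.0]])
H = np.array([[1.0]])
R = np.array([[1.0]])
P_end = integrate_riccati(np.array([[1.0]]), F, Q, H, R, dt=1e-3, steps=10000)
```

In this scalar case the steady state solves $-2P + 1 - P^2 = 0$, i.e. $P = \sqrt{2} - 1$, which the integration approaches for large times.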


Figure 1: Block Schematic of the Optimal Nonlinear Filter. [Blocks: observation $z(\cdot)$; Kalman-Bucy (K-B) filter; Riccati Equation Solver; Bilinear filter system; output map $L$ taking $\hat{x}^*(\cdot/\cdot)$ to $\hat{x}(\cdot/\cdot)$.]
The following example illustrates the optimal filter (3.7), for a specific choice of (2.3), obtained by applying the algorithm developed in the proof of Theorem 5.


Example  6;  M = 3,  N=  2,  A = 0 


B.  = 

l 


o 


0 

0 


o 


0 


i = 1,2 


(4.1) 


Observe that with this choice, the system (2.3) is already in the canonical decoupled form as given by (3.11) and (3.12), and hence we may take, following the notation of the proof, $D(\cdot) \triangleq I$, $\xi^*(\cdot) \triangleq \xi(\cdot)$ and $y(\cdot) \triangleq x(\cdot)$. This simplification permits a vast reduction in the filter dimensionality as follows. Since $N^* = N = 2$, $M = 3$, we have $M^* = 219$, having required three applications of the Kushner equations. But in this case $D(\cdot) \triangleq I$, so that the number of resulting augmenting states can be reduced by a factor of $2^2 = 4$, by combining them as follows. Define

$$y^{k}(t) \triangleq \sum_{i=j=1}^{2} y^{(k,i,j)}(t), \quad k = 1, 2 \tag{4.2}$$

and

$$y^{k,k'}(t) \triangleq \sum_{i=j=1}^{2}\ \sum_{i'=j'=1}^{2} y^{(k,i,j)}_{(k',i',j')}(t), \quad k, k' = 1, 2 \tag{4.3}$$

This yields an $M^* = 21$ dimensional filter with the following coefficient matrices:
and finally


L(L)  = 


3 x 21 


(4.7) 


where the notation used is as follows:

$P_{ij}(t)$, $f_{ij}(t)$: $(i,j)$th elements of the matrices $P(t)$ and $[F(t) - P(t)H^T(t)R^{-1}(t)H(t)]$ respectively,

$\delta(\cdot)$: Kronecker delta function,

and for any $n \times n$ matrix $U$,

$${}^iU \triangleq T_n^i\,U \quad \text{and} \quad U^i \triangleq U\,T_n^i \tag{4.8}$$

Observe the repetition of the […] and 6th block rows as well as the all zero row numbers 6, 9, 11, 12, 14, 15, 17, 18, 20, 21 in the above example, so that the filter dimensionality is in effect reduced to 10.

This observation can be generalized in a straightforward way, and it follows that the filter dimension in effect can be reduced to

$$\sum_{i=0}^{M-1} (M - i)\binom{N^2 N^* + i - 1}{i}.$$
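The dimension counts quoted in Theorem 5 and in Example 6 are easy to reproduce. The sketch below rests on two assumptions: the bound in (3.7) is read as $M\,(q^M - 1)/(q - 1)$ with $q = N^2 N^*$, and the reduced dimension is read as $\sum_{i=0}^{M-1} (M - i)\binom{q + i - 1}{i}$. The function names `full_dim` and `reduced_dim` are hypothetical. For Example 6, combining the augmenting states by the factor $2^2 = 4$ is modeled here by replacing $q = 8$ with $q = 2$.

```python
from math import comb

def full_dim(M, q):
    """M * (1 + q + ... + q^(M-1)), i.e. M (q^M - 1)/(q - 1) for q != 1."""
    return M * sum(q**i for i in range(M))

def reduced_dim(M, q):
    """Sum over i of (M - i) * C(q + i - 1, i), an assumed reading of the formula."""
    return sum((M - i) * comb(q + i - 1, i) for i in range(M))

# Example 6: M = 3, N = N* = 2, so q = N^2 N* = 8 before combining states.
bound_example = full_dim(3, 2 * 2 * 2)   # 219
combined_example = full_dim(3, 2)        # 21 after the factor-4 reduction
effective_example = reduced_dim(3, 2)    # 10
```

These three values agree with the figures 219, 21 and 10 given in the text for the example.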

Despite the possibility of improvements of the above type, the practical implementation of such a filter may at times prove formidable. We conclude this paper by pointing out three nontrivial cases in which the bilinear filter presented here collapses into an easily implementable, nonlinear, memoryless postprocessor, as shown in figure 2.

Figure 2: [Block schematic: Linear Filter followed by a Memoryless Nonlinearity, with output $\hat{x}(t/t)$.]

(i) Linear Systems with Output Nonlinearity:

Suppose the components of the $x(\cdot)$ process to be estimated are multilinear forms in the components of a linear system driven by the $\xi(\cdot)$ process of (2.1). Systems of this type are frequently of interest in realization theory (see e.g. [18]) as they serve as good models for a wide class of nonlinear processes. Since all (conditional) moments of a multivariate gaussian distribution are completely determined by its (conditional) mean vector and (nonrandom conditional) covariance matrix, it is easy to see that $\hat{x}(t/t)$, $t \ge 0$, can be obtained as in figure 2. It can also be checked by direct differentiation that the $x(\cdot)$ process satisfies a bilinear dynamical equation with the nilpotent Lie algebra as in Theorem 5.

(ii) Abelian Systems:

Suppose that the matrices $A$, $B_i$ in (2.3) commute. It is easily verified that

$$x(t) = \exp\Big(\sum_{i=1}^{N} B_i \int_0^t \xi_i(\tau)\,d\tau\Big) \cdot e^{At}\,x(0). \tag{4.9}$$

The desired filter structure now follows upon utilizing the gaussian characteristic function formula. (See also [1] and [19].)
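The closed form (4.9) can be checked numerically for a fixed input path. The sketch below is an illustration under stated assumptions: the input $\xi(\cdot)$ is replaced by a smooth deterministic path standing in for one sample function, the commuting matrices are hypothetical (all are polynomials in the same matrix `J`), and the helpers `expm_series` and `rk4` are simple stand-ins written for this example.

```python
import numpy as np

def expm_series(X, terms=40):
    """Matrix exponential by truncated power series (adequate for small norms)."""
    out = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

# Hypothetical commuting matrices: all are polynomials in J.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
A = 0.3 * J
Bs = [0.5 * np.eye(2) + J, J @ J]

def xi(t):
    """Deterministic stand-in for one sample path of the input."""
    return np.array([np.sin(t), np.cos(2.0 * t)])

def xi_int(t):
    """Componentwise integral of xi from 0 to t."""
    return np.array([1.0 - np.cos(t), 0.5 * np.sin(2.0 * t)])

def rhs(t, x):
    """Right-hand side of (2.3) along the fixed input path."""
    return (A + sum(B * s for B, s in zip(Bs, xi(t)))) @ x

def rk4(x0, T, steps):
    """Classical fourth-order Runge-Kutta integration of dx/dt = rhs(t, x)."""
    h = T / steps
    x, t = x0.copy(), 0.0
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h / 2 * k1)
        k3 = rhs(t + h / 2, x + h / 2 * k2)
        k4 = rhs(t + h, x + h * k3)
        x = x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

x0 = np.array([1.0, -0.5])
T = 1.0
closed = expm_series(sum(B * s for B, s in zip(Bs, xi_int(T)))) @ expm_series(A * T) @ x0
numeric = rk4(x0, T, 2000)
```

The two results agree to within the integration error; the agreement relies on the matrices commuting and would generally fail otherwise.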


(iii) Single Input Systems:

In (2.3) let $N = 1$ and let $\{A, B\}_L$ be nilpotent. If we use the canonical form of Lemma 1, it is not difficult to see that each component of $x(\cdot)$ can be written as a finite sum of terms of the form

$$y_0\, e^{at} \cdot e^{\,b\int_0^t \xi(\sigma)\,d\sigma} \int_0^t\!\!\int_0^{\sigma_1}\!\!\cdots\!\int_0^{\sigma_{\ell-1}} \xi(\sigma_1)\,\xi(\sigma_2)\cdots\xi(\sigma_\ell)\,d\sigma_1 \cdots d\sigma_\ell,$$

where $y_0$ is a random variable independent of $\xi(\cdot)$. But the $\ell$ fold integral in the above expression can be replaced by $\frac{1}{\ell!}\big(\int_0^t \xi(\sigma)\,d\sigma\big)^{\ell}$. Thus case (iii) is roughly a combination of cases (i) and (ii).


For the sake of completeness, we record some formulas useful in solving the above three cases. Let $Y(t) \triangleq \int_0^t \xi(\tau)\,d\tau$. Then,

$$E\big[e^{Y(t)}\,Y^{\ell}(t)/z^t\big] = e^{\hat{Y}(t/t) + \frac12\sigma^2(t)}\; m_\ell\big(\hat{Y}(t/t) + \sigma^2(t),\ \sigma^2(t)\big) \tag{4.10}$$

where $\sigma^2(t)$ is the nonrandom error covariance (computed via a Riccati equation) and $m_\ell(\eta, v)$ is the $\ell$th moment of a gaussian random variable with mean $\eta$ and variance $v$.

Furthermore, $m_\ell(\eta, v)$, $\ell = 2, 4, 6, \dots$ may be recursively computed via (see e.g. [20], pp. 159-162)

$$\frac{\partial m_\ell(\eta, v)}{\partial v} = \frac{\ell(\ell-1)}{2}\, m_{\ell-2}(\eta, v), \qquad m_\ell(\eta, 0) = \eta^{\ell}. \tag{4.11}$$
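Formulas (4.10) and (4.11) are straightforward to evaluate. The sketch below computes the gaussian moments with the well-known algebraic recursion $m_\ell = \eta\,m_{\ell-1} + (\ell - 1)\,v\,m_{\ell-2}$, which is an equivalent alternative to integrating (4.11) in $v$ and is a substitution made here for convenience, not the paper's stated procedure. The function names `gaussian_moment` and `exp_moment` are hypothetical.

```python
import numpy as np

def gaussian_moment(l, eta, v):
    """l-th raw moment of N(eta, v) via m_l = eta*m_{l-1} + (l-1)*v*m_{l-2}."""
    if l == 0:
        return 1.0
    m_prev, m = 1.0, eta                      # m_0 and m_1
    for k in range(2, l + 1):
        m_prev, m = m, eta * m + (k - 1) * v * m_prev
    return m

def exp_moment(l, y_hat, var):
    """E[exp(Y) * Y^l] for Y ~ N(y_hat, var), evaluated through (4.10)."""
    return np.exp(y_hat + 0.5 * var) * gaussian_moment(l, y_hat + var, var)

# Hypothetical values of the conditional mean and error variance.
y_hat, var = 0.4, 0.5
value = exp_moment(4, y_hat, var)
```

The recursion can be checked against (4.11): for example $m_4 = \eta^4 + 6\eta^2 v + 3v^2$, whose derivative in $v$ is $6\,m_2 = 6(\eta^2 + v)$.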

